Risk Minimization and Language Modeling in Text Retrieval
Abstract
With the dramatic increase in online information in recent years, text retrieval is becoming increasingly important. Although many different text retrieval approaches have been proposed and studied in the past decades, it is still a significant scientific challenge to develop principled retrieval approaches that also perform well empirically; so far, the theoretically well-motivated models have rarely led to good performance directly. It is also a great challenge in retrieval to develop models that may go beyond the traditional notion of topical relevance and capture more user factors, such as topical redundancy and sub-topic diversity.
Similar resources
Risk Minimization and Language Modeling in Text Retrieval – Thesis Summary
This thesis presents a new general probabilistic framework for text retrieval based on Bayesian decision theory. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. This risk minimization framework not only unifies several existing retrieval models withi...
Using Text Surrounding Method to Enhance Retrieval of Online Images by Google Search Engine
Purpose: The current research aimed to compare the effectiveness of various tags and codes for retrieving images from Google. Design/methodology: Selected images with different characteristics in a registered domain were carefully studied. The exception was that special conceptual features were apportioned for each group of images separately. In this regard, each group image surr...
Studying the Effect of Retrieval Direction during Reading on Productive and Receptive Knowledge of Vocabulary
Retrieval tasks provide learners with an opportunity to focus both on meaning and on form. There are four different retrieval directions. The present study aimed to identify the optimal direction of recall type retrievals during reading and to investigate the outcomes of each one. Forty-eight intermediate EFL learners took part in the study. One of the experimental groups was provided with the ...
A risk minimization framework for information retrieval
This paper presents a novel probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models (i.e., probabilistic models of text), user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We di...
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents’ contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
Publication date: 2002